Skip to content

A92: xDS ExtAuthz Support#481

Open
markdroth wants to merge 33 commits into
grpc:masterfrom
markdroth:xds_ext_authz
Open

A92: xDS ExtAuthz Support#481
markdroth wants to merge 33 commits into
grpc:masterfrom
markdroth:xds_ext_authz

Conversation

@markdroth
Copy link
Copy Markdown
Member

No description provided.

Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
@markdroth markdroth marked this pull request as ready for review September 18, 2025 22:50
@markdroth markdroth requested a review from dfawley September 18, 2025 22:50
Comment thread A92-xds-ext-authz.md
Comment thread A92-xds-ext-authz.md
Comment thread A92-xds-ext-authz.md
Comment thread A92-xds-ext-authz.md
will be set to the local address of the connection that the request
came in on.
- [principal](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L82):
If TLS is used, this will be set to the server's cert's first URI SAN
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tls package in Go does not provide a way to retrieve the local certificate used in the TLS handshake. We had filed an issue for this with the Go team ages ago, but not much progress there: golang/go#24673

If the configured channel credentials contain a certificate provider though, we would be able to retrieve the local certs from the provider. But the provider could return more than one cert (if the server is serving multiple domain names and expecting the client to specify an SNI during the TLS handshake). Do other languages have a mechanism to retrieve the actual cert used for the handshake?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In C-core, we control the TLS code, so I think we'll be able to handle this.

I'm not sure how hard this will be in Java and Go. If we can't support it, I'm open to dropping it from the gRFC.

I'd like input from @ejona86, @dfawley, @matthewstevenson88, and @gtcooke94 on this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is open PR to support this in Go: golang/go#75699. The changes seem very straightforward, but we might need some pushing to get this through since we originally asked for this about 6 years ago.

Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
- [disallow_all](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L70)
- [allow_expression](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L75)
- [disallow_expression](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L79)
- [disallow_is_error](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L87)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The envoy docs say when this is true, and if the rules in this list cause a header mutation to be disallowed, then the filter using this configuration will terminate the request with a 500 error.

Does this mean Unavailable for us or Unknown?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the normal HTTP-to-gRPC status mapping, an HTTP 500 status would map to UNKNOWN.

I could also see an argument for using INTERNAL here. I don't feel strongly, but maybe @ejona86 or @dfawley have thoughts on this.

Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md

The following server-side metrics will be exported:

| Name | Type | Unit | Labels | Description |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to add the data plane RPC method name as a label to the server-side metrics? Or would that be considered a risk for cardinality explosion?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that might make sense. But if we do that, then we probably also need to apply the same filtering as described in A66 for per-call metrics, so that we don't have a cardinality explosion for non-generated method names.

This makes me realize that these are really per-call metrics, but we'd be defining them via non-per-call metrics framework. @ctiller, are we okay with that for C-core performance-wise, given the work we're doing to move collection into C-core instead of in the stats plugins, or do we need to consider alternatives here?

Comment thread A92-xds-ext-authz.md Outdated
@ejona86 ejona86 self-requested a review November 24, 2025 23:29
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Copy link
Copy Markdown
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments are essentially all editorial/clarity. So I don't have concerns. I did notice grpc.client_ext_authz is an interesting namespace for the otel metrics, but I think I see why you did it. I didn't want to bikeshed over it.

Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
- [method](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L115):
Will always be "POST".
- [header_map](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L143):
The filter will iterate through each header on the data plane RPC.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have headers at this point. Should we say "metadata entries" for clarity?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean by "we don't have headers at this point". The filter only sends the ext_authz request when it sees the request headers, so I'm not sure why we wouldn't have them yet,

I've attempted to clarify the wording here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After staring at it for a while, I figured out what I meant.

The filter is running before the transport, so we haven't actually computed HTTP/2 headers yet; we aren't going to necessarily have stuff like :path or grpc-timeout set. Instead, should we say this is just the application-provided grpc metadata? Or do we need to emulate certain headers here.

Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
The header name. The entry will be ignored if empty, if not all
lower-case, if length exceeds 16384, if the key is `host`, or if it
starts with `:`.
- [value](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/core/v3/base.proto#L415):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If value is used for a -bin header, what should be the behavior?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my current proposal is that the name of the header dictates which field we look at, so we would never look at the value field for a -bin header.

I'm open to other approaches here if we like something else better. For example, we could instead just say that we always look at both, and raw_value takes precedence over value if both are set. But then we have to define what happens if raw_value is used for a non--bin header and contains illegal characters.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with almost anything; I just thought we need to define what should happen. Blindly using the field, even when empty, seems within reason.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After digging into the Envoy impl a bit and talking with some of the Envoy developers, it turns out that the current state of handling the HeaderValue proto is Envoy is as follows:

  • ext_proc only reads and writes the raw_value field; it never reads or writes the value field.
  • ext_authz always writes the raw_value field, but always reads the value field.

So I'm now proposing the following for gRPC:

  • When writing, we always set the raw_value field, never the value field.
  • When reading, if the raw_value field is unset, we fall back to using the value field for backward compatibility.

I've also clarified the validation rules and the handling of failed validation in side-stream responses, which is different from the config validation case.

I've moved the details of this to A102 (#510) and changed both this and A93 (#484) to refer back to A102.

@kannanjgithub @rishesh007 @eshitachandwani @sauravzg FYI

Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Copy link
Copy Markdown
Member Author

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did notice grpc.client_ext_authz is an interesting namespace for the otel metrics, but I think I see why you did it.

I wanted to phrase this in a way that made it clear that the metrics are talking about ext_authz on the gRPC client vs. gRPC server side, as opposed to the client and server of the ext_authz communication itself. (This filter will always be the client side for the ext_authz communication, even if the filter is running on the gRPC server side.)

Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
- [method](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L115):
Will always be "POST".
- [header_map](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L143):
The filter will iterate through each header on the data plane RPC.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean by "we don't have headers at this point". The filter only sends the ext_authz request when it sees the request headers, so I'm not sure why we wouldn't have them yet,

I've attempted to clarify the wording here.

Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
The header name. The entry will be ignored if empty, if not all
lower-case, if length exceeds 16384, if the key is `host`, or if it
starts with `:`.
- [value](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/core/v3/base.proto#L415):
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my current proposal is that the name of the header dictates which field we look at, so we would never look at the value field for a -bin header.

I'm open to other approaches here if we like something else better. For example, we could instead just say that we always look at both, and raw_value takes precedence over value if both are set. But then we have to define what happens if raw_value is used for a non--bin header and contains illegal characters.

Comment thread A92-xds-ext-authz.md Outdated
Comment thread A92-xds-ext-authz.md Outdated
Comment on lines +357 to +359
- [keep_empty_value](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/core/v3/base.proto#L480):
By default, any header mutation that results in a header with an
empty value will cause the header key to be removed. If this field
Copy link
Copy Markdown
Contributor

@sauravzg sauravzg Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to slightly conflict with envoy's implementation and field docs.

https://github.com/envoyproxy/envoy/blob/923e61a41d9da0eb17aa8d1a9d8337f428245095/source/common/http/header_mutation.cc#L29

Roughly, envoy seems to simply treat empty values as no-op while we are trying to delete key when the header key results in "all empty".

I'll continue the implementation following what envoy does.

Comment thread A92-xds-ext-authz.md Outdated
The `decoder_header_mutation_rules` config field controls which header
mutations are allowed. This field is an
[`envoy.config.common.mutation_rules.v3.HeaderMutationRules`](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L47C9-L47C28)
message, which will be used as follows:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure these are compliant with envoy's implementation(assuming we aim to be as much as possible)?

https://github.com/envoyproxy/envoy/blob/011a587518e6fd834a661e172afe0ea50cbb59e0/source/extensions/filters/common/mutation_rules/mutation_rules.cc#L69-L102

The order of precedence seems to be disallow_expr -> allow_expr -> disallow_all

Comment thread A92-xds-ext-authz.md
[Header Mutations](#header-mutations) below.
- body: Ignored; does not apply to gRPC.
- [ok_response](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/external_auth.proto#L134):
- [headers](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/external_auth.proto#L75)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order of precedence between headers and headers_to_remove in case of conflicts is not specified(here and in envoy protos).
Envoy handles headers_to_remove at the end.

https://github.com/envoyproxy/envoy/blob/011a587518e6fd834a661e172afe0ea50cbb59e0/source/extensions/filters/http/ext_authz/ext_authz.cc#L695-L853

Comment thread A92-xds-ext-authz.md Outdated
mutations of all headers not matching `disallow_expression` are allowed.
Note that regexes should be checked for validity as part of resource
validation.
- [disallow_is_error](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L87):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, there's a bit of ambiguitiy( or a divergence from envoy) in how we want to deal with invalid header keys(":" , "host" or "grpc-" etc.) , when disallow_is_error is true

The header name. The entry will be ignored if empty, if not all
lower-case, if length exceeds 16384, if the key is host, or if it
starts with : or grpc-.

So, if we ignore these entries , I assume we are not supposed to raise an error (i.e. "disallow" is different from "ignore"), which would make it slightly divergent with envoy's implementation which seems to return an error for such "unmodifiable" headers.

https://github.com/envoyproxy/envoy/blob/011a587518e6fd834a661e172afe0ea50cbb59e0/source/extensions/filters/common/mutation_rules/mutation_rules.cc#L70-L73

https://github.com/envoyproxy/envoy/blob/011a587518e6fd834a661e172afe0ea50cbb59e0/source/extensions/filters/common/mutation_rules/mutation_rules.cc#L63-L65

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants